
    Parallel and Flow-Based High Quality Hypergraph Partitioning

    Balanced hypergraph partitioning is a classic NP-hard optimization problem that is a fundamental tool in such diverse disciplines as VLSI circuit design, route planning, sharding distributed databases, optimizing communication volume in parallel computing, and accelerating the simulation of quantum circuits. Given a hypergraph and an integer k, the task is to divide the vertices into k disjoint blocks of bounded size while minimizing an objective function on the hyperedges that span multiple blocks. In this dissertation we consider the most commonly used objective, the connectivity metric, where we aim to minimize the number of different blocks connected by each hyperedge.

    The most successful heuristic for balanced partitioning is the multilevel approach, which consists of three phases. In the coarsening phase, vertex clusters are contracted to obtain a sequence of structurally similar but successively smaller hypergraphs. Once the hypergraph is sufficiently small, an initial partition is computed. Lastly, the contractions are undone in reverse order, and an iterative improvement algorithm refines the projected partition on each level.

    An important aspect of designing practical heuristics for optimization problems is the trade-off between solution quality and running time. The appropriate trade-off depends on the specific application, the size of the data sets, and the computational resources available to solve the problem. Existing algorithms are either slow and sequential but offer high solution quality, or simple, fast, and easy to parallelize but offer low quality. While this trade-off cannot be avoided entirely, our goal is to close the gaps as much as possible. We achieve this by improving the state of the art in all non-trivial areas of the trade-off landscape with only a few techniques, employed in two different ways. Furthermore, most research on parallelization has focused on distributed memory, which neglects the greater flexibility of shared-memory algorithms and the wide availability of commodity multi-core machines. In this thesis, we therefore design and revisit fundamental techniques for each phase of the multilevel approach, and develop highly efficient shared-memory parallel implementations thereof.

    We consider two iterative improvement algorithms, one based on the Fiduccia-Mattheyses (FM) heuristic and one based on label propagation. For these, we propose a variety of techniques to improve the accuracy of gains when moving vertices in parallel, as well as low-level algorithmic improvements. For coarsening, we present a parallel variant of greedy agglomerative clustering with a novel method to resolve cluster join conflicts on the fly. Combined with a preprocessing phase for coarsening based on community detection, a portfolio of from-scratch partitioning algorithms, and recursive partitioning with work-stealing, we obtain our first parallel multilevel framework. It is the fastest partitioner known and achieves medium-high solution quality, beating all parallel partitioners and coming close to the highest-quality sequential partitioner.

    Our second contribution is a parallelization of an n-level approach, where only one vertex is contracted and uncontracted on each level. This extreme approach aims at high solution quality via very fine-grained, localized refinement, but seems inherently sequential. We devise an asynchronous n-level coarsening scheme based on a hierarchical decomposition of the contractions, as well as a batch-synchronous uncoarsening and, later, a fully asynchronous uncoarsening. In addition, we adapt our refinement algorithms and reuse the preprocessing and portfolio components. This scheme is highly scalable and achieves the same quality as the highest-quality sequential partitioner (which is based on the same components), but is of course slower than our first framework due to the fine-grained uncoarsening.

    The last ingredient for high quality is an iterative improvement algorithm based on maximum flows. In the sequential setting, we first improve an existing idea by solving incremental maximum flow problems, which leads to smaller cuts and, thanks to engineering efforts, is also faster. Subsequently, we parallelize the maximum flow algorithm and schedule refinements in parallel.

    Beyond the pursuit of the highest quality, we present a deterministic parallel partitioning framework. We develop deterministic versions of the preprocessing, coarsening, and label propagation refinement, and experimentally demonstrate that the penalties for determinism in terms of partition quality and running time are very small.

    All of our claims are validated through extensive experiments that compare our algorithms with state-of-the-art solvers on large and diverse benchmark sets. To foster further research, we make our contributions available in our open-source framework Mt-KaHyPar. While it seems inevitable that, with ever-increasing problem sizes, we must transition to distributed-memory algorithms, the study of shared-memory techniques is not in vain. With the multilevel approach, even the inherently slow techniques have a role to play in fast systems, as they can be employed to boost quality on coarse levels at little expense. Similarly, techniques for shared-memory parallelism remain important, both as soon as a coarse graph fits into memory and as local building blocks in distributed algorithms.
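    To make the connectivity objective concrete: each hyperedge e contributes λ(e) − 1, where λ(e) is the number of distinct blocks containing at least one of its pins. The following is a minimal, illustrative sketch of evaluating this metric for a given k-way partition; it is not Mt-KaHyPar's actual implementation.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Connectivity metric of a k-way partition: each hyperedge e contributes
// lambda(e) - 1, where lambda(e) is the number of distinct blocks that
// contain at least one pin of e. Illustrative sketch, not Mt-KaHyPar code.
std::int64_t connectivity(const std::vector<std::vector<int>>& hyperedges,
                          const std::vector<int>& partition) {
  std::int64_t total = 0;
  for (const auto& e : hyperedges) {
    std::unordered_set<int> blocks;              // blocks touched by this edge
    for (int v : e) blocks.insert(partition[v]);
    if (!blocks.empty())
      total += static_cast<std::int64_t>(blocks.size()) - 1;
  }
  return total;
}
```

    A hyperedge whose pins all lie in one block thus contributes zero, while a hyperedge spanning three blocks contributes two.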

    Deterministic Parallel Hypergraph Partitioning

    Balanced hypergraph partitioning is a classical NP-hard optimization problem with applications in various domains such as VLSI design, simulating quantum circuits, optimizing data placement in distributed databases, or minimizing communication volume in high-performance computing. Engineering parallel heuristics for this problem is a topic of recent research, yet most of them are non-deterministic. In this work, we design and implement a highly scalable deterministic algorithm in the state-of-the-art parallel partitioning framework Mt-KaHyPar. On our extensive set of benchmark instances, it achieves partition quality and performance similar to a comparable but non-deterministic configuration of Mt-KaHyPar, and it outperforms BiPart, the only other existing parallel deterministic algorithm, in partition quality, running time, and parallel speedups.
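    The abstract does not spell out the synchronization scheme, but a common route to determinism in label-propagation-style algorithms is a synchronous round structure: compute all proposed moves in parallel against a fixed snapshot of the labels, then apply them in a fixed order, so the result is independent of thread timing. The sketch below illustrates that round structure on a plain graph; it ignores the balance constraint and is not the actual Mt-KaHyPar scheme.

```cpp
#include <algorithm>
#include <execution>
#include <numeric>
#include <vector>

// One deterministic label propagation round. Phase 1 computes the best
// target block of every vertex in parallel, reading only the old labels and
// writing to a fixed per-vertex slot; phase 2 applies the moves in vertex
// order. NOTE: ignores balance constraints, and fully synchronous updates
// can oscillate; real refiners mitigate both. Sketch only.
void deterministic_lp_round(const std::vector<std::vector<int>>& adj,
                            std::vector<int>& block, int k) {
  const int n = static_cast<int>(adj.size());
  std::vector<int> target(block);              // proposed block per vertex
  std::vector<int> ids(n);
  std::iota(ids.begin(), ids.end(), 0);
  std::for_each(std::execution::par, ids.begin(), ids.end(), [&](int v) {
    if (adj[v].empty()) return;                // isolated vertex: keep block
    std::vector<int> score(k, 0);              // neighbors per block
    for (int u : adj[v]) ++score[block[u]];    // reads only old labels
    target[v] = static_cast<int>(             // first maximum: deterministic
        std::max_element(score.begin(), score.end()) - score.begin());
  });
  for (int v = 0; v < n; ++v) block[v] = target[v];  // deterministic apply
}
```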

    Parallel Flow-Based Hypergraph Partitioning

    We present a shared-memory parallelization of flow-based refinement, which is currently considered the most powerful iterative improvement technique for hypergraph partitioning. Flow-based refinement works on bipartitions, so current sequential partitioners schedule it on different block pairs to improve k-way partitions. We investigate two different sources of parallelism: a parallel scheduling scheme and a parallel maximum flow algorithm based on the well-known push-relabel algorithm. In addition to thoroughly engineered implementations, we propose several optimizations that substantially accelerate the algorithm in practice, enabling its use on extremely large hypergraphs (up to 1 billion pins). We integrate our approach into the state-of-the-art parallel multilevel framework Mt-KaHyPar and conduct extensive experiments on a benchmark set of more than 500 real-world hypergraphs to show that the partition quality of our code is on par with that of the highest-quality sequential code (KaHyPar), while being an order of magnitude faster with 10 threads.
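    For reference, push-relabel maintains a preflow with per-vertex excess and height labels, pushing flow along admissible arcs (height drops by exactly one) and relabeling a vertex when it has excess but no admissible arc. Below is a compact sequential FIFO variant as a textbook baseline; the engineered parallel implementation in the paper differs substantially.

```cpp
#include <algorithm>
#include <climits>
#include <queue>
#include <vector>

// Sequential FIFO push-relabel (textbook sketch, not the paper's code).
struct PushRelabel {
  struct Arc { int to, rev; long long cap; };  // rev = index of reverse arc
  std::vector<std::vector<Arc>> g;
  std::vector<long long> excess;
  std::vector<int> height;

  explicit PushRelabel(int n) : g(n), excess(n), height(n) {}

  void add_edge(int u, int v, long long c) {
    g[u].push_back({v, static_cast<int>(g[v].size()), c});
    g[v].push_back({u, static_cast<int>(g[u].size()) - 1, 0});
  }

  long long max_flow(int s, int t) {
    const int n = static_cast<int>(g.size());
    height[s] = n;
    std::queue<int> active;
    for (Arc& a : g[s]) {               // saturate all arcs out of the source
      if (a.cap == 0) continue;
      excess[a.to] += a.cap;
      g[a.to][a.rev].cap += a.cap;
      a.cap = 0;
      if (a.to != s && a.to != t) active.push(a.to);
    }
    while (!active.empty()) {
      const int u = active.front();
      active.pop();
      while (excess[u] > 0) {           // discharge u completely
        int min_h = INT_MAX;
        for (Arc& a : g[u]) {
          if (a.cap <= 0) continue;
          if (height[u] == height[a.to] + 1) {      // admissible arc: push
            const long long d = std::min(excess[u], a.cap);
            a.cap -= d;
            g[a.to][a.rev].cap += d;
            excess[u] -= d;
            excess[a.to] += d;
            if (a.to != s && a.to != t && excess[a.to] == d) active.push(a.to);
            if (excess[u] == 0) break;
          } else {
            min_h = std::min(min_h, height[a.to]);  // candidate for relabel
          }
        }
        if (excess[u] > 0) {
          if (min_h == INT_MAX) break;  // no residual arc left
          height[u] = min_h + 1;        // relabel and keep discharging
        }
      }
    }
    return excess[t];                   // flow value accumulated at the sink
  }
};
```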

    Evaluation of a Flow-Based Hypergraph Bipartitioning Algorithm

    In this paper, we propose HyperFlowCutter, an algorithm for balanced hypergraph bipartitioning that is based on minimum S-T hyperedge cuts and maximum flows. It computes a sequence of bipartitions that optimize cut size and balance in the Pareto sense, being able to trade one for the other. HyperFlowCutter builds on the FlowCutter algorithm for partitioning graphs. We propose additional features, such as handling disconnected hypergraphs, novel methods for obtaining starting S-T pairs, and an approach to refine a given partition with HyperFlowCutter. Our main contribution is ReBaHFC, a new algorithm that obtains an initial partition with the fast multilevel hypergraph partitioner PaToH and then improves it using HyperFlowCutter as a refinement algorithm. ReBaHFC significantly improves the solution quality of PaToH at little additional running time, and its solution quality is only marginally worse than that of the best-performing hypergraph partitioners KaHyPar and hMETIS, while being one order of magnitude faster. Thus, ReBaHFC offers a new time-quality trade-off in the current spectrum of hypergraph partitioners. For the special case of perfectly balanced bipartitioning, only the much slower plain HyperFlowCutter yields slightly better solutions than ReBaHFC, while only PaToH is faster than ReBaHFC.
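    Since the algorithm optimizes cut size and balance in the Pareto sense, its output can be viewed as the non-dominated subset of (cut, imbalance) candidates. A small illustrative filter for such a candidate set follows; it is not part of HyperFlowCutter's code.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

struct Candidate { long long cut; double imbalance; };

// Keep only Pareto-optimal (cut, imbalance) pairs: after sorting by cut,
// a candidate survives iff it has strictly better imbalance than every
// candidate with smaller or equal cut. Illustrative sketch only.
std::vector<Candidate> pareto_front(std::vector<Candidate> cs) {
  std::sort(cs.begin(), cs.end(), [](const Candidate& a, const Candidate& b) {
    return a.cut != b.cut ? a.cut < b.cut : a.imbalance < b.imbalance;
  });
  std::vector<Candidate> front;
  double best = std::numeric_limits<double>::infinity();
  for (const Candidate& c : cs) {
    if (c.imbalance < best) {          // strictly improves balance
      front.push_back(c);
      best = c.imbalance;
    }
  }
  return front;
}
```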

    Faster and better nested dissection orders for Customizable Contraction Hierarchies

    Graph partitioning has many applications. We consider the acceleration of shortest path queries in road networks using Customizable Contraction Hierarchies (CCH), which are based on computing a nested dissection order by recursively dividing the road network into parts. Recently, two flow-based graph bipartitioning algorithms, FlowCutter and Inertial Flow, have been proposed for road networks. While FlowCutter achieves high-quality results and thus fast query times, it is rather slow. Inertial Flow is particularly fast due to its use of geographical information while still achieving decent query times. We combine the techniques of both algorithms to achieve preprocessing times more than six times faster than FlowCutter and even faster queries on the Europe road network. We show that, using 16 cores of a shared-memory machine, this preprocessing needs four minutes.
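    A nested dissection order itself is simple to state: bisect the graph, order the two sides recursively, and place the separator vertices last so they end up highest in the elimination order. The sketch below assumes a placeholder `bisect` routine standing in for any flow-based bipartitioner in the spirit of FlowCutter or Inertial Flow; real CCH preprocessing involves many further details.

```cpp
#include <functional>
#include <vector>

// `bisect` splits `vertices` into two sides plus a separator; it must make
// progress (no side equal to the full input) for the recursion to terminate.
using Bisection = std::function<void(const std::vector<int>& vertices,
                                     std::vector<int>& left,
                                     std::vector<int>& right,
                                     std::vector<int>& separator)>;

// Appends a nested dissection order of `vertices` to `order`:
// both sides first, separator vertices last. Sketch only.
void nested_dissection(const std::vector<int>& vertices,
                       const Bisection& bisect, std::vector<int>& order) {
  if (vertices.size() <= 1) {                  // base case: trivial order
    order.insert(order.end(), vertices.begin(), vertices.end());
    return;
  }
  std::vector<int> left, right, separator;
  bisect(vertices, left, right, separator);
  nested_dissection(left, bisect, order);
  nested_dissection(right, bisect, order);
  order.insert(order.end(), separator.begin(), separator.end());
}
```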

    PACE solver description: The KaPoCE exact cluster editing algorithm

    The cluster editing problem is to transform an input graph into a cluster graph by performing a minimum number of edge editing operations. A cluster graph is a graph in which each connected component is a clique. An edit operation either adds a new edge or removes an existing edge. In this write-up we outline the core techniques used in the exact cluster editing algorithm of the KaPoCE framework (which also contains a heuristic solver), submitted to the exact track of the 2021 PACE challenge.
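    A classic starting point for exact cluster editing is branching on conflict triples: vertices u, v, w where uv and uw are edges but vw is not, so any cluster graph must delete uv, delete uw, or insert vw. The decision-version sketch below shows this well-known O(3^k) branching; the actual KaPoCE solver adds reduction rules, lower bounds, and much more.

```cpp
#include <vector>

// Can the graph be turned into a cluster graph with at most k edits?
// Branches exhaustively on the first conflict triple found.
// Plain exponential-time sketch, not the engineered KaPoCE solver.
struct ClusterEditing {
  int n;
  std::vector<std::vector<char>> adj;   // symmetric adjacency matrix

  bool solvable(int k) {
    for (int u = 0; u < n; ++u)
      for (int v = 0; v < n; ++v)
        for (int w = v + 1; w < n; ++w) {
          if (v == u || w == u) continue;
          if (adj[u][v] && adj[u][w] && !adj[v][w]) {
            if (k == 0) return false;   // conflict left, no budget
            bool ok;
            adj[u][v] = adj[v][u] = 0;  // branch 1: delete uv
            ok = solvable(k - 1);
            adj[u][v] = adj[v][u] = 1;
            if (ok) return true;
            adj[u][w] = adj[w][u] = 0;  // branch 2: delete uw
            ok = solvable(k - 1);
            adj[u][w] = adj[w][u] = 1;
            if (ok) return true;
            adj[v][w] = adj[w][v] = 1;  // branch 3: insert vw
            ok = solvable(k - 1);
            adj[v][w] = adj[w][v] = 0;
            return ok;                  // all resolutions of this triple tried
          }
        }
    return true;                        // no conflict: already a cluster graph
  }
};
```

    Iterating k = 0, 1, 2, ... until `solvable(k)` succeeds yields the minimum number of edits.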

    PACE solver description: KaPoCE: A heuristic cluster editing algorithm

    The cluster editing problem is to transform an input graph into a cluster graph by performing a minimum number of edge editing operations. A cluster graph is a graph in which each connected component is a clique. An edit operation either adds a new edge or removes an existing edge. In this write-up we outline the core techniques used in the heuristic cluster editing algorithm of the Karlsruhe and Potsdam Cluster Editing (KaPoCE) framework, submitted to the heuristic track of the 2021 PACE challenge.
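    One simple heuristic in this problem's spirit is greedy local moving: repeatedly move a vertex to a neighboring cluster whenever that reduces the implied edit cost, i.e., the number of missing edges inside clusters plus edges between clusters. The sketch below is a generic illustration of this idea, not the KaPoCE heuristic itself.

```cpp
#include <vector>

// Greedy local moving for cluster editing. `cluster[v]` holds ids in [0, n).
// Each accepted move strictly decreases the total edit cost, so the loop
// terminates. Generic local-search sketch only.
void local_moving(const std::vector<std::vector<int>>& adj,
                  std::vector<int>& cluster) {
  const int n = static_cast<int>(adj.size());
  std::vector<int> size(n, 0);
  for (int c : cluster) ++size[c];
  bool improved = true;
  while (improved) {
    improved = false;
    for (int v = 0; v < n; ++v) {
      std::vector<int> deg(n, 0);              // edges from v into each cluster
      for (int u : adj[v]) ++deg[cluster[u]];
      const int a = cluster[v];
      for (int u : adj[v]) {
        const int b = cluster[u];
        if (b == a) continue;
        // Cost change of moving v from a to b: pairs with the old cluster
        // flip from "inside" to "cut", pairs with the new one the other way.
        const int delta = (2 * deg[a] - (size[a] - 1))
                        + ((size[b] - deg[b]) - deg[b]);
        if (delta < 0) {
          --size[a]; ++size[b]; cluster[v] = b;
          improved = true;
          break;                               // recompute degrees for next v
        }
      }
    }
  }
}
```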

    A Branch-And-Bound Algorithm for Cluster Editing

    The cluster editing problem asks us to transform a given graph into a disjoint union of cliques by inserting and deleting as few edges as possible. We describe and evaluate an exact branch-and-bound algorithm for cluster editing. For this, we introduce new reduction rules and adapt existing ones. Moreover, we generalize a known packing technique to obtain lower bounds and experimentally show that it contributes significantly to the performance of the solver. Our experiments further evaluate the effectiveness of the different reduction rules and examine the effects of structural properties of the input graph on solver performance. Our solver won the exact track of the 2021 PACE challenge.
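    The classical packing technique that the paper generalizes works as follows: every conflict triple (u, v, w), with uv and uw edges and vw a non-edge, forces at least one edit, and if no two chosen triples share a vertex pair, their forced edits are distinct, so the number of packed triples lower-bounds the optimum. Below is a greedy sketch of this classical bound; the paper's generalized version is stronger.

```cpp
#include <vector>

// Greedy packing lower bound: collect conflict triples that are pairwise
// disjoint in vertex pairs; each needs a distinct edit, so the count is a
// valid lower bound on the optimal number of edits. Sketch of the classical
// technique, not the paper's generalization.
int packing_lower_bound(const std::vector<std::vector<char>>& adj) {
  const int n = static_cast<int>(adj.size());
  std::vector<std::vector<char>> used(n, std::vector<char>(n, 0));
  int bound = 0;
  for (int u = 0; u < n; ++u)
    for (int v = 0; v < n; ++v)
      for (int w = v + 1; w < n; ++w) {
        if (v == u || w == u) continue;
        if (adj[u][v] && adj[u][w] && !adj[v][w] &&
            !used[u][v] && !used[u][w] && !used[v][w]) {
          used[u][v] = used[v][u] = 1;   // reserve all three vertex pairs
          used[u][w] = used[w][u] = 1;
          used[v][w] = used[w][v] = 1;
          ++bound;
        }
      }
  return bound;
}
```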

    Open Problems in (Hyper)Graph Decomposition

    Large networks are useful in a wide range of applications. Sometimes problem instances are composed of billions of entities. Decomposing and analyzing these structures helps us gain new insights about our surroundings. Even if the final application concerns a different problem (such as traversal, or finding paths, trees, and flows), decomposing large graphs is often an important subproblem for complexity reduction or parallelization. This report summarizes the discussions at Dagstuhl seminar 23331 on "Recent Trends in Graph Decomposition" and presents currently open problems and future directions in the area of (hyper)graph decomposition.